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1.0 INTRODUCTION 



The problem o^ -fitting a surface to small sets of given data 
has been addressed in many different ways and several computer 
programs are currently available which enable one to deal with 
the problem effectively- Many of the methods available involve a 
global interpolation or approximation scheme and often involves 
solving a system of equations with an equivalent number of 
unknowns. For very large sets of data^, the problem is 
computationally intractable. This consideration provides the 
motivation behind the development of a way to pare the problem 
down to a more manageable size- 

We wish to construct a function F which approximately fits 
the data since we assume the data collection is subject to 
measurement error- We propose to use approximation by least 
squares Thin Plate Splines (TPS), where the surface function is 
constructed so as to minimize an error function subject to 
cer tai n__c^onstrai nts. Solving the approximation problem will also 
involve as many equations as there are data points, but the 
number of unknowns will be significantly fewer. Part of the 
appeal of TPS approx i mati on lies in the fact that it minimizes a 
certain linear functional, and involves a linear combination of 
functions with no greater complexity than the natural logarithm 
of the distance function. 

Interpol at i on of scattered data by the method of TPS was 
developed from engineering considerations by Harder and Desmarais 
C5D- It can be thought of as a two dimensional generalization of 
the cubic spline, which models a thin beam under point loads 
subject to equilibrium constraints- The TPS function is derived 
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-from a di -f -f erent i al equation which gives the de-formation of an 
infinite, thin plate under the influence of point loads. A point 
load is applied at each data point so that the interpolating 
surface can be constructed as a sum of fundamental solutions of 
the TPS equation. 

In using the least squares TPS approximation method to fit 
the surface, a fewer number of basis functions than the number of 
given data points is employed. These basis functions are 
centered at a different, smaller set of points, which in analogy 
with the univariate case, we call the knots. Therefore, the 
problem at hand is one of selecting the knot points, and hence 
the basis functions. This approach differs from the use of 
smoothing splines, which were introduced by Wahba and 
Wendelberger C9U in the multidimensional case, and called 
Laplacian Smoothing Splines <LSS> . LSS minimize a certain 
functional which is a linear combination of a term measuring 
fidelity— to the data and one measuring smoothness of the function 
(a generalization of the usual thin plate spline functional). In 
this case there is still one basis function for each data point, 
but the interpolation condition is relaxed- 

Given a 'large' set of data points, ^ ~ 1,-..,N, 

we wish to find a smaller set of knot points, J “ 

1,...,K, which will 'represent' the former reasonably well. This 
could be accomplished by choosing a subset of the original set, 
or by some process which produces a representative set. The 
ultimate goal is to approximate the surface from which the origi- 
nal data arose using the representative set- Hence, a surface 
fit to the large set and one fit to the representati ve set should 
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be essentially the same- 



Approximation by least squares TPS is strai ghtf orward , once 
the knot points are known. We construct the TPS function 

2 

F(x,y) = Z A . d . "^1 og ( d . ) + ax + by + c 
j=i j j j 

where and the coe-f-f icients are chosen 

to minimize the error function _ 

N 

E = Z fCF(x.,y.) - . 

. 1 ’ ^ 1 1 1 

1 = 1 



The ordinates, f^, may be subject to random errors, say with 
standard deviation, s^ , at the i^^ data point. We model the 
plate under the point loads at the knot points (as opposed to the 
data points); therefore the constraint equations for the least 
squares TPS method, which may be thought of as 'equilibrium 
conditions' on the plate should be satisfied. Thus, the error 
function is minimized subject to the constraint equations: 



K- — 

E A = 0 

j = l 

« 

K 

E A 9 = 0 

j = l 

We use LINPACK C 1 H subroutines to do the actual calculations. 

Previous attempts have been made to minimize the error 
function by considering it to be a function of the knot point 
locations as well as the coefficients, wherein a total of 3K 
parameters are involved. As reported on by Schmidt C8], the 
initial knot configuration was taken to be of tensor product 
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^orm. The overall minimization process is a large non-linear 
one, and is complicated by possible coalescense oT knots as well 
as non-unique solutions (as indicated by consideration oT one- 
dimensional cases). Also, the objective -function may have many 
local minima so that avoiding poor local minima or searching Tor 
better local minima may be necessary. Because oT these kinds o-f 
problems, our goal is to decouple the knot selection proems -from 
the least squares process. 

When data are somewhat uniformly distributed, methods invol- 
ving tensor product cubic splines may be desirable. Tensor pro- 
duct methods place knot locations on a grid, which may not 
reflect the actual disposition of the data points; in Tact, 
there could be no data nearby. Furthermore, even though these 
problems are surmountable, they could lead to nonuniqueness oT 
solutions and a minimum norm solution that may not be 
aesthetically appealing. 

A cLLTTerent point oT view is considered here wherein the 
knot point locations are predetermined based on two criteria. 
Specifically, we shall make assumptions relating the density of 
data to the dependent variable and mandating the importance of 
each individual data point. Solution of the overdetermi ned sys- 
tem of equations follows the knot point selection. A summary of 
the approach and results will be presented. Examples are given 
which illustrate rather well the ability of the scheme to select 
knot locations which reflect the underlying density of the data. 
Actual surface fitting and comparison with two other methods, the 
Laplacian Smoothing Splines of Wahba and Wendelberger C93, and 
the tensor product bicubic Hermite method due to Foley C33 are 
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also reported on- 

We also point out two related ideas which are attempts to 
decrease the number ot basis functions to be considered. Each is 
an attempt to choose a subset of points to be used to construct 
the approximation. Schiro and Williams C7D used an adaptive 
process for subset selection and overlapping regions to construct 
underwater topographic maps- Bozzini, diTisi, and Lenarduzzi L21 
gave an algorithm for selecting a subset of points which were 
important to proper definition of the surface- Both of these 
methods made no assumptions about the density of the data points 
relative to the behavior of the surface, and required 
consi der ation of the ordinate values- 

2.0 THE KNOT SELECTION PROCESS 

Given 'a priori' flexibility in knot placement, the problem 
becomes the selection of knot locations, followed by solution of 
the system by least squares- Since the selection of knot 
locations is to be decoupled from the solution of the least 
squares problem, some assumptions must be made in order to devel- 
op an algorithm for the knot selection process- First, we 
assume that the independent variable data reflects something 
about the behavior of the dependent variable- For example, the 
density of the data points may be dependent on the curvature of 
the surface. Hence, where relatively many data points are found, 
the function is assumed to be changing behavior rapidly, whereas 
a low density of data indicates slowly changing behavior. 

A1 though this assumption is not universally satisfied in 
practice, it does not seem to be an unreasonable one- 



The second assumption is that each data point is equally 
important in defining the underlying surface. Therefore the 
number of data points represented by each knot should be the same 
or nearly the same. This leads to 'equal representati on ' of the 
data points by the knot points where each data point is 'close' 
to a knot point. A key advantage achieved in pursuing this 
approachr-is the existence of a natural heuristic for .moving the 
knots around the plane in searching for a good knot 
configuration. This point will be elaborated on later in the 
paper . 

□ur knot selection algorithm is based on these last two 
assumptions. First, we wish to minimize the sum of the distances 
squared from each data point to the nearest knot point; that is, 
minimize the 'global ' value, 

2 2 
GN = E min C(x.— j>.) + (y.— y.) 3 

i=l j ^ ^ ^ ^ 

This function is continuous and piecewise quadratic. The 
expression leads naturally to a 'default' Dirichlet Tesselation, 
a partitioning of the plane with respect to the knot points (see 
Figure 1). Thus, we say each data point belongs to some knot 
point according to the Dirichlet tile in which it lies. Data 
points on any of the tile boundaries (ties) must be resolved by a 
determination of which tile they belong to or some sharing 
mechanism. Our initial guess at the knot point configuration was 
taken to be quasi-gridded. 

Differentiation of GN"“ with respect to the and show 
that at the minimum, each knot point will occupy the centroid 
with respect to the data points inside that tile. Given the 



6 



initial con-figuration o-f knot points with its Dirichlet 
Tesselation, the -following algorithm for iteration to a local 
minimum GN^ value is employed: 

(a) compute the centroid o-f each tile with respect to the 
data points contained within each tile; 

(b) move the knots to the corresponding centroids, which 

results in a new Dirichlet Tesselation and a new set o^^ knot 
point - data point associations; this is the configuration for 
the next^.i teration- _ 

(c) quit when two successive iterations yield the s^me knot 

locations, which means that a local minimum value of has been 

found . 

This algorithm was formulated in discussions at the Istituto per 

le Applicazioni della Matematica e del 1 ' Inf ormati ca in 1983 C103, 

after the problem was posed by G- Nielson and R. Franke. 

We note that the value of GN"" will necessarily decrease as 

the iterations continue, until two successive iterations yield 

the same configuration; this will be proven below- In the case 

where no data points lie in a tile for some knot point, the knot 

point i^“moved to the nearest data point- This mechanism avoids 

knots without data points- Futhermore, if a data point lies on a 

tile boundary, it is assigned to the knot with the smallest 

subscript (amongst the appropriate choices of knot points) - 

Employment of a different criterion for the resolution of ties 

may yield different results- We note that knots cannot coalesce- 

The following theorem is pertinent to this algorithm- 

Theorem: The function GN"^ decreases with each iteration which 

involves movement of a knot point- 

Proof: Write GN^ in the more convenient form 



GN^ 



E E C (x.-5> 
j=l ^ ^ 



(y . -y . ) ^3 
1 



( 1 ) 
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where = Ci: (x.,y^) in the tile -for In <1), the 

interior sum is the sum o-f the distances squared -from the data 
points in a tile to the knot point in that tile, and the exterior 
sum is over all K of the tiles. Let a prime denote the new knot 
points and index sets. This form leads to the expressions 

<x',y') = < Z X . /n . , T. y-/n.>- 

ifc I . i€ I . 

J J 



where n^ is the number of indices in the set 1^. The new knot 
points will lead to a new tesselation, followed by the new index 
sets Ij. Then the expression (1) is greater than or equal to 

Z Z + (y.-y-)^] (2) 

j=l ici . ^ J ' J 

J 

because the new knot point locations minimize the contribution of 
the interior sums. This expression (2), in turn, is greater than 
or equal to 



Z E c (X. '. )^ + (y.-y*.)^: (3) 

j=l i€l^ ^ ^ ^ ^ 

since an index i moves to another set only in the case wherein 

the correspondi ng data point is now closer to a different knot 

2 

point, thus decreasing its contribution to the global GN value. 

% 

2 

Finding a local minimum of GN is well -served by this algo- 
rithm; however, as seen in a one dimensional example C6] , the 
2 

function GN is rife with local minima, and the local minimum 
value found depends on the initial conf iguration of knots used. 
We can draw similar conclusions for the multidimensional case 
based on the one dimensional analogy. 

The process of locating each knot occurs in two distinct 
steps: first, each data point is assigned to the closest knot 
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and second, a determination is made within each tile as to the 

location o^ its centroid. As a direct result o^ the centroid 

2 

requirement, the GN function will stabilize at the local minimum 
value correspond! ng to the particular quadratic piece on which 
the knot is ^ound. 

The local minimum value will -frequently occur out oi the 
domain o^ the corresponding quadratic piece- This leads to a 
'cascading' phenomenon which continues until a local minimum 
value is attained- However, the global minimum value will not 
necessarily be attained since the cascading will stop as soon as 
a local minimum value is tound within the domain of each 
quadratic piece- 

This inconsistent performance of the algorithm in finding 

2 

the global minimum value of GN leads to consideration of a 
somewhat different criterion for locating a good configuration of 
knot points- We wish to exploit the second assumption specified 

'~y 

ear 1 ier ,_whi le taking advantage of the minimization of the GN"^ 
function- Since each data point is assumed to be equally impor- 
tant, the Dirichlet tile for each knot should contain about the 
same number of data points- Thus, we wish to minimize the sum of 
the squares of the differences between the number of knots in 
each tile and the average number of data points that should 
belong to each tile; that is, minimize the quantity 

K 

D = E (n . - N/K)- 
j = l ^ 

The new algorithm -for determining knot locations is based on the 
minimization o-f D, subject to the constraint that each knot be 
located at the centroid o-f its tile. 
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This new optimization leads to a natural heuristic -for 
moving knots -from a stable con-figuration to a possibly better 
configuration- We call the current con-figuration o-f knots a 
'base' con-f i gurati on , and iterate through the algorithm as 
•f ol 1 ows: 

(a) generate a new guess -for the knot locations by moving 
the knat<s> with the smallest number o-f data points in their 
tile(s) toward the knot<s> with the largest number o-f knot<s> in 
their ti te <s> ; the distance moved is initally a large -fracTtion o-f 
the total distance between the knots. 

<b) iterate to a stable con-f iguration using the -first 
algorithm, compute the values o-f GN^ and D, and compare D to the 
smallest value achieved to date, as represented by that of the 
base configuration; 

(c) repeat the process above when a smaller value of D is ob- 
tained, with the present conf iguration as the base conf iguration; 

(d) when a smaller value of D is not found, take a shorter 
step in the movement of the knot(s) and repeat the process above; 

(e) continue with smaller and smaller steps until a smaller 
value of D is found (or an equal value of D with a smaller GN^ 
value) or until the knot locations return to the base 

conf iguration; 

(f) perform the search in the symmetrical way when the base 
conf igurati on is returned to; that is, move the knot(s) with the 
largest number of data points in their tiles toward the knot(s) 
with the smallest number of points in their tile; 

(g) quit when no smaller value of D is found. 

The movement of the knots is justified by the rationale that 

a more equitable distribution of data points can be found by 

moving the tile boundaries across data points. Note that the 

2 

algorithm for computing a local minimum of the GN function value 
is embedded in this new algorithm. 



3.0 RESULTS AND EXAMPLES 

Using our algorithm for the a priori selection of the knot 
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point locations, experiments were conducted to test the scheme 
using different sets o-f test data- This was -followed by 
veri-f ication o-f the scheme on a large set o-f real data. Results 
-from two sets of the test data are presented here: one 

consisting of 200 data points called 'Cliff', and one consisting 
of 500 data points called 'Humps and Dips'- Both sets of data 
were generated using known functions (see Franke C4H) in a way 
that forced the density of points to be proportional to the 
curvature of the sampled function. Figures 2—5 show these two 
test data sets graphically, and illustrate the optimized knot 
point conf igurations found using the least squares algorithm. 
Figure 6 depicts actual hydrographic data collected in Monterey 
Bay, with greatly varying density. Figure 7 shows the results of 
applying the algorithm with 100 knots, and is particularly 
interesting because the density of the data is faithfully 
replicated by the knots. We note that the assumption regarding 
the den_si_ty of the data points being indicative of the behavior 
of the dependent variable is not actually satisfied in this case, 
due to the source of the data. Nonetheless, these results 
demonstrate the ability of the algorithm to produce representa- 
tive sets of knots. 

We also investigated how closely the constructed surface F 
and the 'true' surface resemble one another. This comparison is 
made in the context of the root-mean— squared error (RMS) of the 
residuals (at the data points) and on a rectangular grid of 
locations in the plane- The two data sets constructed above, and 
one other which was generated in a similar manner, were used to 
compare RMS errors for the least squares algorithm developed here 



with the method of Mahba and Wendelberger (Laplacian Smoothing 
Splines), and the method oi Foley (Bicubic Hermite Tensor Product 
Spl i nes) . 

The dependent variable values of the experimental data sets 
were generated in two ways: 1) using a known function, and 2) 

contaminating the known function by the injection of independent, 
identically distributed normal random errors with a specified 
composite standard deviation of 0.05. The actual standard 
deviation was about 0.0485. In the first case, we would expect 
the RMS error on the data points and on the grid to be about the 
same, and to decrease as the number of knot points used to 
represent the data is increased. 

In the contaminated case, the dependent variable at each data 
point is the sum of the unknown underlying function value and the 
error function value so that the difference between the 
constructed surface and the 'true' surface is mainly attributable 
to the pjresence of error in the data. In the best situation, we 
expect the RMS error in the residuals to match the composite 
standard deviation of the random error injected to obtain the 
contaminated data- At the grid points, we expect the RMS error 
to be smaller than the composite standard deviation, since the 
grid sample is larger (33*33) and the errors are distributed more 
evenly throughout the entire region of interest. 

Some observations can be made regarding Tables 1—3. The 
general trend of the RMS error on both the data points and the 
grid is to diminish as the number of knot points is increased. 

As expected with the exact data, the RMS error of the residuals 
and the RMS error on the grid are roughly equivalent. For the 
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contaminated data, the RMS error o-f the residuals roughly matches 
the composite standard deviation of the data, and the RMS error 
on the grid is smaller than the RMS error of the residuals, as 
expected- In Table 1, the errors over the grid increase as the 
number of knot points is increased, and the RMS value of the 
residuals becomes less than the composite standard deviation of 
the injected errors- In this case, under smoothi ng has occurred, 
and the surface is “drawing** toward the error- 

In comparing least squares TPS to the smoothing spline 
method in the exact data case, we note that the smoothing spline 
method yields a residual RMS error of O- This could be expected, 
since there is no error in the data and the spline of 
interpolation is chosen- On the grid, the RMS error is small- 
When the data is not contaminated, the RMS error of the least 
squares algorithm only begins to become as small as that of the 
smoothing splines method when the number of knots used becomes 
large- We also note that in the 500 data point set, no 
comparison is made since a potential limit for computing 
smoothing splines is 200—300 data points- 

Foley's method for the contaminated case gives errors nearly 
equal to the composite standard deviation of the errors injected 
into the data- However, on the grid, the least squares method 
does better, an indication that smoothing is occurring, as 
expected- We also note that an increase in the number of grid 
points does not significantly improve the RMS error in Foley's 
method, even though an increase in the number of knots in the 
least squares method usually yields improved results- We used 
the default local approximations in Foley's method, and we note 
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that per-farmance o-f the method may be improved by using lower 
degree local approximations to estimate the grid values to be 
used . 

Finally, we note that the search -for a best knot con-figura- 
tion can turn out to be rather expensive. For a large number o-f 
data points with a moderately large number o-f knot points, the 
computational e-f-fort could be excessive, although we are 
investigating ways o-f speeding up the algorithm. Furthermore, as 
we noted earlier, the end results are dependent on the initial 
guess, although they generally look quite good -for any reasonable 
initial guess. 
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METHOD 


NUMBER OF 
DATA POINTS/ 
KNOT POINTS 


NO ERRORS 
IN DATA 

RESIDUAL GRID 


CONTAMINATED DATA 
RESIDUAL GRID 


LSTPS 


200/20 


0.01562 


0.01474 


0.05214 


0.01795 


LSTPS 


200/25 


0.01179 


0.01 154 


0 . 04805 


0.02040 


FOLEY 


200/5x5 


0.00777 


0.00613 


0.05996 


0.04819 


LSTPS 


200/35 


0.00626 


0.00616 


0.04590 


0.02146 


FOLEY 


200/6x6 


0.00512 


0.00417 


0.05113 


0.03745 


SMOOTHING 


200 


0.0 


0.00096 


0.04272 


0.01806 


Table 


1: Comparison 


of RMS errors on 


CLIFF' 200 points 


METHOD 


NUMBER OF 
DATA POINTS/ 
KNOT POINTS 


NO ERRORS 
IN DATA 

RESIDUAL GRID 


CONTAMINATED DATA 
RESIDUAL GRID 


LSTPS 


200/20 


0. 05525 


0.05465 


0.07571 


0.05866 


LSTPS 


200/25 


0.02520 


0.02646 


0.05603 


0.03385 


FOLEY 


200/5x5 


0. 01206 


0.01332 


0.04819 


0.04965 


LSTPS 


200/35 


0.01662 


0.01843 


0.05274 


0.02853 


FOLEY 


200/6x6 


0.00968 


0.01144 


0.05028 


0.03962 


SMOOTHING 


200 


0.0 


0.00254 


0.03900 


0. 02789 


Table 2: 


Comparison o-F 


RMS errors 


on 'HUMP 


AND DIPS' 200 


poi nts 
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METHOD 


NUMBER OF 
DATA POINTS/ 
KNOT POINTS 


NO ERRORS 
IN DATA 


CONTAMINATED 


DATA 






RESIDUAL 


GRID 


RESIDUAL 


GRID 


LSTPS 


500/20 


0.02402 


0.02517 


0.05256 


0.02738 


LSTPS 


500/25 


0.01664 


0.01766 


0.04818 


0.02283 


FOLEY 


500/5x5 


0.01346 


0.01230 


0.05844 


0.03767 


LSTPS 


500/50 


0.00645 


0.00845 


0.04544 


"0.01961 


FOLEY 


500/7x7 


0.00645 


0.00552 


0.05696 


0.04864 



Table 3: Comparison o-f RMS errors on 'HUMPS St DIPS' 500 points 
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Figures 4 ar^d 5: The 'Humps and Dips' data set. Note how 

clumps of data appear in three portions of the plarie, indicating 
that the uiider lying surface is undergoing change. A set of 50 
I not points was used to represent the data. 



C\J 




A 



eg 




A 



Oi 

L 

a 

O' 

H 

u. 



20 



Figures 6 and 7: Hydrographic Data From Monterey Bay, Ca- 

Here, 1669 data points (soundings) are represented by 100 knot 
points. The results are very reasonable. 
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